Abstract


Formal  verification  by  model  checking  has  the  potential  to  produce  major  enhancements  in  the 
reliability and robustness of software. However, a shortcoming in most model checking research is the
failure  to  consider  how  to  make  the  use  of  model  checking  routine  throughout  various  stages  of 
software  development.  This  report  presents  results  of  the  Independent  Research  and  Development 
(IRAD)  project  on  verification  of  evolving  software  conducted  at  the  Software  Engineering  Institute  in 
2005.  The  research  conducted  as  part  of  the  IRAD  project  considered  ways  to  reduce  the  effort  of 
subsequent  verifications.  In  particular,  it  resulted  in  the  development  of  techniques  that  exploit  the 
results of previous verification efforts and focus only on the portions of the system (components) that
have changed. Thus, these new techniques incorporate model checking into development processes in
a  much  less  intrusive  or  cumbersome  manner  than  previous  verification  techniques. 

The  report  presents  an  automated  and  compositional  procedure  to  solve  the  component  substitutability 
problem.  The  solution  contributes  two  techniques  for  checking  the  correctness  of  software  upgrades: 
(1)  a  technique  based  on  simultaneous  use  of  overapproximations  and  underapproximations  obtained 
via  existential  and  universal  abstractions  and  (2)  a  dynamic  assume-guarantee  reasoning  algorithm  in 
which  previously  generated  component  assumptions  are  reused  and  altered  “on  the  fly”  to  prove  or 
disprove  the  global  safety  properties  on  the  updated  system.  When  upgrades  are  found  to  be 
non-substitutable,  the  solution  generates  constructive  feedback  that  shows  developers  how  to  improve 
the  components.  The  substitutability  approach  has  been  implemented  and  validated  in  the  Component 
Formal  Reasoning  Technology  (COMFORT)  model  checking  tool  set.  The  experimental  evaluation  of 
an  industrial  benchmark  demonstrates  encouraging  results. 
1 Introduction


Correctness  of  computer  software  is  critical  in  today’s  information  society,  especially  for  software  that 
runs  on  computers  embedded  in  our  transportation  and  communication  infrastructure.  Examples  of 
serious  software  errors  are  easy  to  find.  For  instance,  in  1997,  the  propulsion  system  of  the  Aegis 
missile  cruiser  USS  Yorktown  failed  for  over  two  hours  due  to  a  software  bug  [Slabodkin  98].  The 
cause  turned  out  to  be  a  division  by  zero  within  a  database  system,  which  resulted  in  an  exception  and 
a  crash  of  all  computer  consoles  and  terminal  units.  The  software  of  the  USS  Yorktown  operated  on  a 
network  of  Windows  NT  machines  and  was  quite  complex,  consisting  of  several  million  lines  of  C 
code. 

Another  instance  is  the  development  of  the  F/A-22  as  part  of  the  Joint  Strike  Fighter  program.  The 
project  was  delayed  multiple  times,  and  often  the  project’s  delay  was  caused  by  the  inability  of 
software  developers  to  produce  bug-free  software  for  the  F/A-22.  Pilots  often  had  to  reboot  computers 
while  in  the  air  [U.S.  Govt.  05,  Nellemann  94].  The  F/A-22  has  about  2.5  million  lines  of  software 
written  in  Ada.  This  number  is  expected  to  rise  to  6  million  lines  of  C/C++  code  on  the  F-35. 

Computer  software  also  plays  an  important  role  in  other  parts  of  our  infrastructure.  On  August  14, 
2003, a blackout affected more than 50 million people in large areas on the U.S. east coast, causing
damage estimated at between $4 billion and $10 billion [U.S.-Canada 04]. While the blackout was
triggered  by  trees  hitting  local  power  transmission  lines,  a  software  bug  made  the  damage  devastating. 
A  bug  in  General  Electric  (GE)  Energy’s  XA/21  power  control  system  allowed  the  blackout  to  spread. 
The  software  had  been  in  use  since  1990,  but  the  bug  had  not  become  apparent  previously.  The  flaw 
was  discovered  by  an  audit  of  over  4  million  lines  of  C/C++  code  after  the  blackout  and  was  identified 
as  a  “race  condition.” 

Programs  in  imperative  languages  like  C  or  C++  are  executed  line-by-line  in  what  is  called  a  thread  of 
control.  It  is  tempting  to  hope  that  a  line-by-line  inspection  of  the  code,  following  this  thread  of 
control,  will  uncover  all  the  flaws  in  a  program.  The  problem  is  that  complex  systems  have  many 
software  components  running  in  parallel,  so  there  are  many  different  threads  of  control  that  run 
simultaneously.  While  one  of  these  threads  may  be  executing  some  statement  in  its  program,  another 
thread,  with  exactly  the  same  program,  may  be  executing  an  entirely  different  line  of  code 
concurrently.  Consequently,  in  the  presence  of  multiple  threads,  any  combination  of  program  lines 
that  the  threads  can  execute  must  be  considered. 

The  state  of  the  program  is  the  location  of  the  control  in  each  thread  and  the  values  of  the  program 
variables.  To  discover  flaws,  the  possible  states  of  the  program  must  be  explored.  To  illustrate  the 
large number of states that concurrency can cause, consider the small program in Figure 1. It has one
variable x, which is initialized with zero. It has two threads of control (A and B) and only four lines of
code  in  total.  The  first  line  in  both  threads  simply  idles  until  x  becomes  zero.  The  second  line  sets  x  to 
1  or  2,  respectively.  Despite  its  tiny  size,  the  program  has  10  reachable  states.  The  explosion  in  the 
number  of  reachable  states  is  due  to  the  different  combinations  of  program  locations  in  the  two  threads 
A  and  B.  Thus,  a  manual  search  for  errors  in  large  concurrent  programs  is  infeasible. 

Model  checking  is  an  automated  technique  for  the  exploration  of  all  the  states  of  a  system 
[Clarke  82,  Clarke  00b].  Introduced  in  1981,  it  has  become  a  standard  verification  technique  in  the 
hardware  industry.  It  has  been  successfully  used  to  find  bugs  in  circuitry  that  would  have  been  hard  to 
find by inspection alone.


Thread A                    Thread B

1  while (x != 0) skip;     1  while (x != 0) skip;
2  x = 1;                   2  x = 2;
3                           3

Figure  1:  A  Small  Program  with  Two  Threads  of  Control 
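The state count for Figure 1 can be checked by brute force. The sketch below is an illustration rather than part of the original report: it assumes that each numbered line executes atomically, that threads interleave arbitrarily, and that location 3 means the thread has terminated. The names `step_thread`, `successors`, and `reachable` are hypothetical.

```python
from collections import deque

# A state is (pc_a, pc_b, x); program counter 3 means "thread finished".
INITIAL = (1, 1, 0)

def step_thread(pc, x, value):
    """Return (pc, x) after one step of a thread, or None if it has finished."""
    if pc == 1:
        # while (x != 0) skip;  -> the loop is left only when x == 0
        return (2, x) if x == 0 else (1, x)
    if pc == 2:
        # x = value;
        return (3, value)
    return None

def successors(state):
    """All states reachable in one step by letting either thread move."""
    pc_a, pc_b, x = state
    a = step_thread(pc_a, x, 1)
    if a is not None:
        yield (a[0], pc_b, a[1])
    b = step_thread(pc_b, x, 2)
    if b is not None:
        yield (pc_a, b[0], b[1])

def reachable(initial):
    """Breadth-first exploration of the reachable state space."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

states = reachable(INITIAL)
print(len(states))  # prints 10
```

Under these assumptions the exploration finds the 10 reachable states mentioned in the text, including the two final states (3, 3, 1) and (3, 3, 2), which differ only in which thread wrote x last.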

Model  checking  also  has  the  potential  to  produce  major  enhancements  in  the  reliability  and  robustness 
of  software.  The  basic  idea  of  software  model  checking  is  to  explore  all  the  states  of  the  software 
system  systematically.  The  states  are  checked  for  errors.  Such  an  error  may  be  division  by  zero  as  in 
the  case  of  the  USS  Yorktown,  a  race  condition  as  in  the  case  of  GE’s  XA/21,  or  a  violated  assertion. 
Once  such  an  erroneous  state  is  found,  it  can  be  reported  to  the  programmer  together  with  a 
counterexample  (i.e.,  an  error  trace),  which  demonstrates  the  flaw.  Counterexamples  can  be  very 
helpful  for  understanding  the  nature  of  the  error  and  fixing  it. 

However, the effectiveness of the model checking of such systems is severely constrained by the state
space explosion problem (i.e., the sheer number of states a program can be in). If there are too many
states,  it  becomes  impossible  to  explore  all  of  them,  even  on  a  powerful  computer. 

Much  of  the  research  in  this  area  is  therefore  targeted  at  reducing  the  state  space  of  the  model  used  for 
verification.  One  principal  method  in  state  space  reduction  of  software  systems  is  abstraction. 
Abstraction  techniques  reduce  the  program  state  space  by  generating  a  smaller  set  of  states  in  a  way 
that  preserves  the  relevant  behaviors  of  the  system.  Abstractions  are  most  often  performed  in  an 
informal,  manual  manner  and  require  considerable  expertise. 

Manual  abstraction  is  error  prone  too.  The  person  performing  the  abstraction  will  often  capture  the 
intended  behavior  when  abstracting  and  not  the  behavior  of  the  actual  code.  Thus,  a  bug  could  be 
hidden  in  the  code.  Industrial  applications  of  model  checking  therefore  favor  automated  ways  to 
compute  the  abstract  model.  One  such  method,  called  predicate  abstraction  [Graf  97,  Colon  98],  has 
proven  to  be  particularly  successful  when  applied  to  large  software  programs.  We  exploited  predicate 
abstraction  while  developing  a  solution  to  the  problem  of  establishing  the  correctness  of  evolving 
systems.  We  describe  predicate  abstraction  and  its  application  to  verification  of  evolving  software  in 
Section  3.2. 

The  other  principal  approach  in  reducing  the  state  space  of  the  verifiable  model  is  compositional 
reasoning.  Compositional  reasoning  partitions  verification  into  checks  of  individual  modules,  while 
the  global  correctness  of  the  composed  system  is  established  by  constructing  a  proof  outline  that 
exploits  the  modular  structure  of  the  system.  We  used  the  assume-guarantee  style  of  compositional 
reasoning  to  support  verification  of  evolved  systems  [Pnueli  85].  We  describe  the  assume-guarantee 
reasoning  paradigm  and  its  application  to  verification  of  evolving  software  in  Section  3.3. 

In  this  document,  we  describe  a  particular  model  checking  problem,  namely  verification  of  evolving 
software.  The  rest  of  the  document  is  organized  as  follows:  Section  2  provides  some  background 
information  on  the  model  checking  technology,  the  types  of  claims  it  can  analyze,  and  the  current  state 
of  research  and  practice  of  model  checking.  Section  3  describes  the  problem  of  verification  of 
evolving  systems  and  presents  a  detailed  description  of  the  techniques  that  we  have  developed  to 
overcome difficulties in the verification of evolving programs. Section 4 provides an overview of
related  work,  and  Section  5  summarizes  the  contributions  of  the  Independent  Research  and 
Development  (IRAD)  project. 
2 Model Checking


In  formal  verification,  a  system  is  modeled  mathematically,  and  its  specification  (also  called  a  claim  in 
model  checking)  is  described  in  a  formal  language.  When  the  behavior  in  a  system  model  does  not 
violate  the  behavior  specified  in  a  claim,  the  model  satisfies  the  specification.  Model  checking 
[Clarke  82]  is  a  fully  automated  form  of  formal  verification  that  uses  algorithms  that  check  whether  a 
system  satisfies  a  desired  claim  through  an  exhaustive  search  of  all  possible  executions  of  the  system. 
The  exhaustive  nature  of  model  checking  renders  the  typical  testing  question  of  adequate  coverage 
unnecessary. 

Model  checking  is  a  technique  for  verifying  finite-state  concurrent  systems.  One  benefit  of  this 
restriction  to  finite-state  systems  is  that  verification  can  be  performed  automatically.  Given  sufficient 
resources,  model  checking  always  terminates  with  a  “yes”  or  “no”  answer.  Moreover,  it  can  be 
implemented  by  algorithms  that  have  reasonable  efficiency  and  that  can  be  run  on  moderate-sized 
machines. 

Although  the  restriction  to  finite-state  systems  may  seem  to  be  a  major  disadvantage,  model  checking 
is  applicable  to  several  important  classes  of  systems  [Clarke  00b].  Hardware  controllers  are 
finite-state  systems,  as  are  many  communication  protocols.  Software,  which  is  not  finite  state,  can  still 
be  verified  if  variables  are  assumed  to  be  defined  over  finite  domains.  This  assumption  does  not 
restrict  the  applicability  of  model  checking  because  many  interesting  behaviors  of  the  software 
systems  can  be  specified  with  finite-state  models.  For  example,  systems  with  unbounded  message 
queues  can  be  verified  by  restricting  the  size  of  the  queues  to  a  small  number  such  as  two  or  three. 

In  classical  model  checking,  systems  are  modeled  mathematically  as  state  transition  systems,  and 
claims  are  specified  using  temporal  logic  [Pnueli  77,  Clarke  86].  Temporal  logic  is  used  to  define 
formulas  that  describe  system  behavior  over  time,  where  the  propositions  of  the  logic  are  behaviors  of 
interest  involving  state  information  (current  state  or  values  of  variables)  or  events.  Temporal  logic 
formulas  combine  such  propositions  with  temporal  operators  to  describe  interesting  patterns  of 
propositions  over  time,  such  as  the  following: 

•  Whenever  X  is  greater  than  Y,  Z  must  also  be  greater  than  Y. 

•  Some  invariant  (e.g.,  mutual  exclusion  with  respect  to  some  resource)  always 
holds  once  initialization  is  complete. 

•  A  component  can  issue  requests  only  during  an  allowed  interval  (as  bounded  by 
events  granting  and  taking  away  permission). 
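As an illustration, the three patterns above could be written in linear temporal logic roughly as follows. This is one possible encoding, not taken from the report: G is the "always" operator, W is "weak until", and the proposition names (initialized, mutex, request, grant, revoke) are hypothetical.

```latex
\begin{aligned}
&\mathbf{G}\,(X > Y \rightarrow Z > Y) \\
&\mathbf{G}\,(\mathit{initialized} \rightarrow \mathbf{G}\,\mathit{mutex}) \\
&(\neg\mathit{request}\;\mathbf{W}\;\mathit{grant}) \;\wedge\;
 \mathbf{G}\,\bigl(\mathit{revoke} \rightarrow (\neg\mathit{request}\;\mathbf{W}\;\mathit{grant})\bigr)
\end{aligned}
```

The third formula reads: no request occurs before the first grant, and after every revocation no request occurs until permission is granted again.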

Temporal  logic  model  checking  is  extremely  useful  in  verifying  the  behavior  of  systems  composed  of 
concurrent  processes  or  interacting  nondeterministic  sequential  tasks.  Concurrency  errors  (as  well  as 
errors  caused  by  the  nondeterministic  execution  of  actions)  are  among  the  most  difficult  to  find  by 
testing  because  they  tend  to  be  irreproducible. 
2.1 The Process of Model Checking

Model  checking  involves  the  following  steps: 

1.  The  system  is  modeled  using  the  description  language  of  a  model  checker, 
producing  a  model  M. 

2. The claim to check is defined using the specification language of the model
checker, producing a temporal logic formula $\varphi$.

3. The model checker automatically checks whether $M \models \varphi$ (i.e., whether $M$
satisfies $\varphi$).

The  model  checker  checks  all  system  executions  captured  by  the  model  and  produces  the  answer 
“yes”  as  output  if  the  claim  holds  in  the  model  (M)  and  the  answer  “no”  otherwise.  When  a  claim  is 
not  satisfied,  most  model  checkers  produce  a  counterexample  of  system  behavior  that  causes  the 
failure.  A  counterexample  defines  an  execution  trace  that  violates  the  claim.  Counterexamples  are  one 
of  the  most  useful  features  of  model  checking,  as  they  allow  users  to  understand  quickly  why  a  claim 
is  not  satisfied. 
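The yes/no answer and the counterexample can be illustrated with a minimal explicit-state checker. The sketch below is a simplification made for illustration: it handles only invariant claims ("a bad state is never reached") rather than full temporal logic, and the model `MODEL` and the function `check_invariant` are hypothetical.

```python
from collections import deque

def check_invariant(initial_states, successors, holds):
    """Return (True, None) if `holds` is true in every reachable state,
    otherwise (False, trace), where trace runs from an initial state to a
    violating state."""
    parent = {s: None for s in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        if not holds(state):
            # Rebuild the counterexample by following parent links backwards.
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return False, trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return True, None

# Hypothetical model M: each state maps to its list of successor states.
MODEL = {
    "idle": ["request"],
    "request": ["granted", "idle"],
    "granted": ["idle", "error"],
    "error": [],
}

ok, trace = check_invariant(["idle"], lambda s: MODEL[s], lambda s: s != "error")
print(ok, trace)  # False ['idle', 'request', 'granted', 'error']
```

Because the search is breadth-first, the reported counterexample is a shortest path to a violating state, which tends to make it easier to read.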


2.2  Current  Research  in  Software  Model  Checking 

Model  checking  is  efficient  in  hardware  verification,  but  applying  it  to  software  is  complicated  by 
several  factors,  ranging  from  the  difficulty  of  modeling  computer  systems  (due  to  the  complexity  of 
programming  languages  as  compared  to  hardware  description  languages)  to  difficulties  in  specifying 
meaningful  claims  for  software  using  the  usual  temporal  logical  formalisms  of  model  checking.  The 
most  significant  limitation,  however,  is  the  state  space  explosion  problem  (which  applies  to  both 
hardware  and  software),  whereby  the  complexity  of  model  checking  becomes  prohibitive. 

State  space  explosion  results  from  the  fact  that  the  size  of  the  state  transition  system  is  exponential  in 
the  number  of  variables  and  concurrent  units  in  the  system.  When  the  system  is  composed  of  several 
concurrent  units,  its  combined  description  may  lead  to  an  exponential  explosion  as  well.  The  state 
space  explosion  problem  is  the  subject  of  most  model  checking  research. 

The  following  state  space  reduction  techniques  are  commonly  used  during  verification  of  software: 

•  Compositional  reasoning:  Verification  is  partitioned  into  checks  of  individual 
modules,  while  the  global  correctness  of  the  composed  system  is  established  by 
constructing  a  proof  outline  that  exploits  the  modular  structure  of  the  system. 

•  Abstraction:  A  smaller  abstract  system  is  constructed  such  that  the  claim  holds 
for  the  original  system  if  it  holds  for  the  abstract  system. 

•  Counterexample-guided  abstraction  refinement:  Abstracted  systems  are 
refined  iteratively  using  information  extracted  from  counterexamples  until  an  error 
is  found  or  it  is  proven  that  the  system  satisfies  the  verification  claim. 


CMU/SEI-2005-TR-008 


2.2.1  Compositional  Reasoning 


Because  model  checking  was  created  to  verify  hardware  systems  and  because  most  hardware  designs 
have  a  natural  division  into  modules,  the  extension  of  model  checking  to  larger  designs  is  often 
achieved  by  taking  a  “divide  and  conquer”  approach.  More  specifically,  the  verification  claim  for  a 
system  is  first  decomposed  into  a  set  of  local  claims,  one  for  each  system  module.  These  local  claims 
are then verified separately. The compositional approach establishes whether, for given systems $M_1$
and $M_2$ and a claim $T$, the composed system satisfies $T$ (written $M_1 \parallel M_2 \models T$). A naive
compositional approach checks that (1) $M_1 \models T$ and (2) $M_2 \models T$ and concludes that
$M_1 \parallel M_2 \models T$. Although this rule is sound in theory, it is often not useful in practice.
Usually, both $M_1$ and $M_2$ behave like $T$ only in a suitable environment. To solve this problem, the
compositional principle can be strengthened to an assume-guarantee principle
[Abadi 95, Alur 96, Clarke 89, Kurshan 95, McMillan 97]: in order to check $M \models T$, it suffices to
check both $M_1 \parallel T_2 \models T_1$ and $M_2 \parallel T_1 \models T_2$. This obligation uses the local
specifications $T_1$ and $T_2$ as the constraining environments (also called assumptions) for $M_2$ and
$M_1$, respectively, so that each module is checked in isolation from the other. In general, for a system
composed of multiple modules, assume-guarantee reasoning succeeds only if it can be shown that each
system component $M_i$ satisfies a corresponding specification component $T_i$ under a suitable
constraining environment.

2.2.2  Abstraction 

Abstraction  is  one  of  the  principal  techniques  for  reducing  the  complexity  of  a  verification 
problem  [Ball  01,  Clarke  92,  Kurshan  95].  Abstraction  techniques  reduce  the  state  space  by  mapping 
the  concrete  set  of  actual  system  states  to  an  abstract  set  of  states  that  preserve  the  actual  system’s 
behavior.  Abstractions  are  usually  performed  in  an  informal,  manual  manner  and  require  considerable 
expertise.  Predicate  abstraction  [Graf  97,  Colon  98]  is  one  of  the  most  popular  and  widely  applied 
methods  for  the  systematic  abstraction  of  systems.  It  maps  concrete  data  types  to  abstract  data  types 
through  predicates  over  the  concrete  data.  However,  the  computational  cost  of  the  predicate 
abstraction  procedure  may  be  too  high,  making  generation  of  a  full  set  of  predicates  for  a  large  system 
infeasible. 

In  practice,  the  number  of  computed  predicates  is  bounded,  and  model  checking  is  guaranteed  to 
deliver  sound  results  within  this  bound.  The  bound  limit  is  increased  when  errors  (if  any)  are  found 
within  the  bound  and  fixed.  Under  this  approach,  software  systems  are  rendered  finite  by  restricting 
variables  to  finite  domains.  As  mentioned  earlier,  bounded  model  checking  does  not  seriously  restrict 
the  applicability  of  model  checking,  since  many  interesting  behaviors  of  software  systems  can  be 
specified  using  bounded  finite-state  models. 

The  abstract  program  is  created  using  existential  abstraction  [Clarke  92].  This  method  defines  the 
transition  relation  of  the  abstract  program  so  it  is  guaranteed  to  be  a  conservative  overapproximation 
of  the  original  program,  with  respect  to  the  set  of  given  predicates.  The  use  of  a  conservative 
abstraction,  as  opposed  to  an  exact  abstraction,  produces  considerable  reductions  in  the  state  space. 
The  drawback  of  the  conservative  abstraction  is  that  when  model  checking  of  the  abstract  program 
fails,  it  may  produce  a  counterexample  that  does  not  correspond  to  a  concrete  counterexample.  Such  a 
counterexample  is  usually  called  spurious.  When  a  spurious  counterexample  is  encountered, 
refinement  is  performed  by  adjusting  the  set  of  predicates  in  a  way  that  eliminates  it. 
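The effect of existential abstraction, including a spurious counterexample, can be seen on a toy system. The sketch below is an illustration under stated assumptions and is not the abstraction procedure of any particular tool: the concrete system is a counter over 0..7 that repeatedly adds 2, the claim is that the value 5 is never reached, and the predicates are hand-picked. All names are hypothetical.

```python
def existential_abstraction(states, init, step, predicates):
    """Build the abstract initial states and transitions induced by `predicates`."""
    def alpha(s):
        # An abstract state is the tuple of predicate truth values.
        return tuple(p(s) for p in predicates)
    abs_init = {alpha(s) for s in init}
    # An abstract transition exists if SOME pair of concrete states realizes it.
    abs_trans = {(alpha(s), alpha(t)) for s in states for t in step(s)}
    return alpha, abs_init, abs_trans

def abstract_reachable(abs_init, abs_trans):
    """Abstract states reachable from the abstract initial states."""
    seen = set(abs_init)
    frontier = list(abs_init)
    while frontier:
        a = frontier.pop()
        for src, dst in abs_trans:
            if src == a and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen

def may_reach(bad, states, init, step, predicates):
    """True if the abstract model can reach the abstract state containing `bad`."""
    alpha, abs_init, abs_trans = existential_abstraction(states, init, step, predicates)
    return alpha(bad) in abstract_reachable(abs_init, abs_trans)

STATES = range(8)
INIT = {0}
step = lambda x: [(x + 2) % 8]  # concrete runs visit only 0, 2, 4, 6
BAD = 5

coarse = [lambda x: x >= 4]
refined = [lambda x: x >= 4, lambda x: x % 2 == 1]

print(may_reach(BAD, STATES, INIT, step, coarse))   # True: a spurious error path
print(may_reach(BAD, STATES, INIT, step, refined))  # False: the claim is proven
```

With only the predicate x >= 4, the abstract model cannot tell 5 apart from 4 or 6 and reports an error path that no concrete run follows. Adding the parity predicate removes that path, which is the kind of adjustment refinement performs.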
2.2.3 Counterexample-Guided Abstraction Refinement (CEGAR)

Although  conservative  abstraction  procedures  (which  ensure  that  if  a  claim  holds  for  the  abstract 
system,  it  also  holds  for  the  original  system)  are  typically  used,  any  form  of  abstraction  may  introduce 
behaviors  not  found  in  the  concrete  system.  Counterexamples  from  model  checking  the  abstract 
system  are  often  used  to  detect  unrealistic  behaviors  and  refine  the  system.  Repeatedly  refining  the 
abstractions,  however,  may  introduce  additional  behaviors  that  result  in  state  space  explosion  during 
the  model  checking  phase.  These  drawbacks  (coupled  with  the  potential  effectiveness  of  abstraction 
methods)  motivated  research  into  targeted  abstractions  (i.e.,  control  abstraction,  loop  abstraction,  and 
so  forth),  which  can  result  in  more  accurate  abstract  systems. 

The  abstraction  refinement  process  has  been  automated  by  the  CEGAR  paradigm 
[Kurshan  95,  Ball  00,  Clarke  00a,  Das  01].  The  CEGAR  framework  is  shown  in  Figure  2:  one  starts 
with  a  coarse  abstraction  (for  example,  an  abstraction  of  a  C  program).  If  an  error  trace  reported  by 
the  model  checker  is  not  realistic,  the  error  trace  is  used  to  refine  the  abstract  program,  and  the  process 
proceeds  until  no  spurious  error  traces  can  be  found.  The  actual  steps  of  the  loop  follow  the 
abstract-verify-refine paradigm and depend on the abstraction and refinement techniques used.


Figure  2:  The  CEGAR  Framework 

The  steps  are  described  below  in  the  context  of  predicate  abstraction. 

1.  Program  abstraction:  Given  a  set  of  predicates,  a  finite-state  model  is  extracted 
from  the  code  of  a  software  system,  and  the  abstract  transition  system  is 
constructed. 

2.  Verification:  A  model  checking  algorithm  is  run  to  check  whether  the  model 
created by applying predicate abstraction satisfies the desired behavioral claim $\varphi$.
If the claim holds, the model checker reports success ($\varphi$ is true), and the CEGAR
loop  terminates.  Otherwise,  the  model  checker  extracts  a  counterexample,  and  the 
computation  proceeds  to  the  next  step. 

3.  Counterexample  validation:  The  counterexample  is  examined  to  determine 
whether  it  is  spurious.  This  examination  is  done  by  simulating  the  (concrete) 
program  using  the  abstract  counterexample  as  a  guide,  to  find  out  if  the 
counterexample  represents  actual  program  behavior.  If  this  is  the  case,  the  bug  is 
reported ($\varphi$ is false), and the CEGAR loop terminates. Otherwise, the CEGAR
loop  proceeds  to  the  next  step. 
4. Predicate refinement: The set of predicates is changed to eliminate the detected
spurious counterexample and possibly other spurious behaviors introduced by
predicate abstraction. Given the updated set of predicates, the CEGAR loop
proceeds to Step 1.
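The four steps can be sketched as a loop over a small explicit-state system. This is a toy illustration, not the COPPER implementation: the concrete system is finite and enumerated, the claim is an invariant, and Step 4 simply takes the next predicate from a fixed candidate list, which stands in for real predicate discovery from the spurious counterexample. All function names are hypothetical.

```python
from collections import deque

def find_abstract_path(abs_init, abs_trans, abs_bad):
    """Breadth-first search for an abstract path from an initial to a bad state."""
    parent = {a: None for a in abs_init}
    queue = deque(abs_init)
    while queue:
        a = queue.popleft()
        if a in abs_bad:
            path = []
            while a is not None:
                path.append(a)
                a = parent[a]
            return path[::-1]
        for b in abs_trans.get(a, ()):
            if b not in parent:
                parent[b] = a
                queue.append(b)
    return None

def concretize(path, init, step, is_bad, alpha):
    """Replay an abstract path on the concrete system.
    Returns a concrete error trace, or None if the path is spurious."""
    layer = {s: [s] for s in init if alpha(s) == path[0]}
    for a in path[1:]:
        nxt = {}
        for s, trace in layer.items():
            for t in step(s):
                if alpha(t) == a and t not in nxt:
                    nxt[t] = trace + [t]
        layer = nxt
    for s, trace in layer.items():
        if is_bad(s):
            return trace
    return None

def cegar(states, init, step, is_bad, candidates):
    predicates, pool = [], list(candidates)
    while True:
        # Step 1: program abstraction over the current predicates.
        def alpha(s, ps=tuple(predicates)):
            return tuple(p(s) for p in ps)
        abs_init = {alpha(s) for s in init}
        abs_bad = {alpha(s) for s in states if is_bad(s)}
        abs_trans = {}
        for s in states:
            for t in step(s):
                abs_trans.setdefault(alpha(s), set()).add(alpha(t))
        # Step 2: verification of the abstract model.
        path = find_abstract_path(abs_init, abs_trans, abs_bad)
        if path is None:
            return "safe", None
        # Step 3: counterexample validation against the concrete system.
        trace = concretize(path, init, step, is_bad, alpha)
        if trace is not None:
            return "unsafe", trace
        # Step 4: predicate refinement (assumption: take the next candidate).
        if not pool:
            return "unknown", None
        predicates.append(pool.pop(0))

STATES = range(8)
safe_result = cegar(STATES, {0}, lambda x: [(x + 2) % 8], lambda x: x == 5,
                    [lambda x: x >= 4, lambda x: x % 2 == 1])
bug_result = cegar(STATES, {0}, lambda x: [(x + 1) % 8], lambda x: x == 1,
                   [lambda x: x >= 1])
print(safe_result)  # ('safe', None)
print(bug_result)   # ('unsafe', [0, 1])
```

The sketch answers "unknown" when the candidate list is exhausted. It therefore never reports a wrong verdict, but unlike a real refinement procedure it is not guaranteed to reach one.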

The  efficiency  of  this  process  depends  on  the  efficiency  of  the  program  abstraction  and  predicate 
refinement  procedures.  While  program  abstraction  focuses  on  constructing  the  transition  relation  of 
the  abstract  program,  the  focus  of  predicate  refinement  is  to  define  efficient  techniques  for  choosing 
the  set  of  predicates  in  a  way  that  eliminates  spurious  counterexamples.  In  both  areas  of  research,  low 
computational  cost  is  a  key  factor  because  it  enables  the  application  of  model  checking  to  the 
verification  of  realistic  programs. 

The techniques presented in this report rely on the efficient abstraction and abstraction-refinement
procedures of the CEGAR loop as implemented in the COPPER model checker [Chaki 05c].
In  this  report,  we  present  a  solution  to  the  model  checking  problem  that  arises  during  verification  of 
evolving  systems,  and  we  refer  the  reader  to  the  article  by  Chaki  and  colleagues  [Chaki  04c]  for 
details  regarding  the  COPPER  abstraction  and  refinement  procedures.  The  next  section  describes  the 
problem  of  verifying  evolving  software  and  presents  our  solution  to  address  it.  This  solution  was 
originally  published  by  Chaki  and  colleagues  [Chaki  05a]. 
3 Verification of Evolving Software


Successfully  transitioning  model  checking  technology  has  proven  to  be  a  challenging  task.  While  the 
benefits  of  successful  model  checking  are  clear,  there  are  several  barriers  to  successful  transition. 
Principally,  model  checking  has  serious  scalability  problems,  and  the  techniques  are  difficult  for 
software  engineers  to  use. 

A  major  shortcoming  in  most  model  checking  research  is  the  failure  to  consider  how  to  make  the  use 
of  model  checking  routine  throughout  various  stages  of  software  development.  Software  inevitably 
evolves  as  designs  take  shape,  requirements  change,  and  bugs  are  discovered  and  fixed.  Model 
checking  is  useful  at  each  such  point,  but  the  current  state  of  model  checking  requires  that  software 
verification  of  the  entire  system  be  performed  anew  each  time.  The  time  and  effort  required  to  verify 
an  entire  system  can  be  considerable,  and  repeating  the  exercise  after  each  change,  no  matter  how 
small,  would  likely  discourage  use. 

In  this  report,  we  present  ways  to  reduce  the  effort  of  subsequent  verifications.  In  particular,  by 
exploiting  the  results  of  previous  verification  efforts  and  focusing  only  on  the  portions  of  the  system 
that  have  changed  (components),  model  checking  can  be  incorporated  into  development  processes  in  a 
much  less  intrusive  or  cumbersome  manner. 

We  present  techniques  that,  while  not  affecting  the  initial  model  checking  effort,  reduce  by  orders  of 
magnitude  the  effort  to  keep  analysis  results  up  to  date  with  evolving  system  design.  The  techniques 
are  decision  procedures  that  determine  if  all  system-correctness  properties  previously  established  by 
model  checking  remain  valid  for  the  new  version  of  the  system. 

The  key  idea  is  to  determine  automatically  if  these  properties  hold  for  the  new  system  without 
repeating  each  of  the  individual  verification  checks.  We  present  a  verification  method  [Chaki  05a]  that 
focuses  on  system  components  that  have  changed  during  the  evolution  of  software  and  determines  if 
all  behaviors  of  the  original  system  are  preserved  in  the  new  version  of  the  system.  Moreover, 
whenever  behaviors  are  not  preserved,  our  technique  automatically  provides  feedback  to  developers 
showing  how  to  improve  the  components  whenever  possible. 

3.1  Background  and  Notation 

Let $\cdot$ denote the concatenation operator over sequences, and let $X^*$ denote zero or more applications
of $\cdot$ over $X$ as usual. For any two sets $X$ and $Y$, we will denote the set
$\{x \cdot y \mid x \in X \wedge y \in Y\}$ by $X \cdot Y$.

Definition 1 (Words and Traces) Given an alphabet $\Sigma$ and a set of atomic propositions $AP$, we
say that $(\Sigma, AP)$ is a state/event (SE) alphabet. For an SE alphabet $\hat{\Sigma} = (\Sigma, AP)$, the set of
words over $\hat{\Sigma}$ is denoted by $Word(\hat{\Sigma})$ and defined as $Word(\hat{\Sigma}) = (\Sigma \cdot 2^{AP})^*$. The set of
traces over $\hat{\Sigma}$ is denoted by $Trace(\hat{\Sigma})$ and defined as $Trace(\hat{\Sigma}) = 2^{AP} \cdot Word(\hat{\Sigma})$.

Thus, a word or a trace is an alternating sequence of subsets of $AP$ and elements of $\Sigma$. However, a
word always begins with an action, ends with a set of propositions, and can be empty. In contrast, a
trace begins and ends with a set of propositions and cannot be empty.
Definition 2 (Doubly Labeled Automaton) A doubly labeled automaton (DLA) is a 7-tuple
$(S, Init, AP, \mathcal{L}, \Sigma, \delta, F)$ such that (i) $S$ is a finite set of states, (ii) $Init \subseteq S$ is a set of
initial states, (iii) $AP$ is a finite set of (atomic) state propositions, (iv) $\mathcal{L} : S \rightarrow 2^{AP}$ is a
state-labeling function, (v) $\Sigma$ is a finite set of events or actions (alphabet), (vi)
$\delta \subseteq S \times \Sigma \times S$ is a transition relation, and (vii) $F \subseteq S$ is a set of final or accepting states.


For any DLA with transition relation $\delta$, we write $q \xrightarrow{\alpha} q'$ to mean $q' \in \delta(q, \alpha)$
(i.e., $(q, \alpha, q') \in \delta$). A DLA is said to be deterministic if for any $q \in S$, $\alpha \in \Sigma$, and
$P \subseteq AP$, there is at most one $q' \in S$ such that $q \xrightarrow{\alpha} q'$ and $\mathcal{L}(q') = P$. DLAs are
not more expressive than standard finite automata, since propositional labelings
can always be rewritten in terms of actions [Clarke 00b]. However, we choose to use the DLA
formalism for the sake of simplicity because it captures the essence of the SE-based notation.


Definition 3 (Language) Let $M = (S, Init, AP, \mathcal{L}, \Sigma, \delta, F)$ be a DLA and $\hat{\Sigma} = (\Sigma, AP)$. A trace
$t \in Trace(\hat{\Sigma})$ is accepted by $M$ if $t = \langle P_1, \alpha_1, P_2, \ldots, \alpha_{n-1}, P_n \rangle$ and there exists a
sequence $s_1, s_2, \ldots, s_n$ of states of $M$ such that (i) $s_1 \in Init$, (ii) $s_n \in F$, (iii) for
$1 \le i \le n$, $\mathcal{L}(s_i) = P_i$, and (iv) for $1 \le i < n$, $s_i \xrightarrow{\alpha_i} s_{i+1}$. The language of $M$ is
denoted by $L(M)$ and defined as the set of all traces accepted by $M$.
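Definitions 2 and 3 translate fairly directly into code. The sketch below is one possible encoding, chosen for illustration: a DLA is a named tuple, a trace is a Python list that alternates proposition sets and actions, and `accepts` tracks the set of states a nondeterministic DLA may occupy. The class `DLA`, the function `accepts`, and the example automaton are hypothetical.

```python
from typing import NamedTuple

class DLA(NamedTuple):
    states: frozenset
    init: frozenset
    ap: frozenset
    label: dict        # state -> frozenset of atomic propositions
    alphabet: frozenset
    delta: frozenset   # set of (source, action, target) triples
    final: frozenset

def accepts(m, trace):
    """Check whether trace = [P1, a1, P2, ..., a(n-1), Pn] is accepted by m."""
    if len(trace) % 2 == 0:
        return False  # a trace starts and ends with a proposition set
    props, rest = trace[0], trace[1:]
    # Initial states whose label matches the first proposition set.
    current = {s for s in m.init if m.label[s] == props}
    for action, props in zip(rest[0::2], rest[1::2]):
        current = {t for (s, a, t) in m.delta
                   if s in current and a == action and m.label[t] == props}
    return bool(current & m.final)

EMPTY = frozenset()
DONE = frozenset({"done"})
m = DLA(
    states=frozenset({"s0", "s1"}),
    init=frozenset({"s0"}),
    ap=frozenset({"done"}),
    label={"s0": EMPTY, "s1": DONE},
    alphabet=frozenset({"wait", "go"}),
    delta=frozenset({("s0", "wait", "s0"), ("s0", "go", "s1")}),
    final=frozenset({"s1"}),
)

print(accepts(m, [EMPTY, "wait", EMPTY, "go", DONE]))  # True
print(accepts(m, [EMPTY, "go", EMPTY]))                # False: wrong label after "go"
```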


A  language  is  said  to  be  regular  iff  it  is  accepted  by  some  DLA.  The  set  of  regular  languages  is  closed 
under  union,  intersection,  and  complementation.  Deterministic  DLAs  (DDLAs)  are  equivalent  to 
DLAs  as  far  as  language  acceptance  is  concerned.  In  other  words,  for  any  regular  language  L  there  is 
a  DDLA  M  such  that  L(M)  =  L.  Also  every  regular  language  L  is  accepted  by  a  unique  (up  to 
isomorphism)  minimal  DDLA. 


Definition 4 (Abstraction) Given two DLAs $M_1$ and $M_2$, we say that $M_2$ is an abstraction of $M_1$,
denoted by $M_1 \sqsubseteq M_2$, iff $L(M_1) \subseteq L(M_2)$.


Definition 5 (Parallel Composition) Let $M_1 = (S_1, Init_1, AP_1, \mathcal{L}_1, \Sigma_1, \delta_1, F_1)$ and
$M_2 = (S_2, Init_2, AP_2, \mathcal{L}_2, \Sigma_2, \delta_2, F_2)$ be two DLAs. The parallel composition of $M_1$ and $M_2$,
denoted by $M_1 \parallel M_2$, is the DLA $(S_1 \times S_2, Init_1 \times Init_2, AP_1 \cup AP_2, \mathcal{L}, \Sigma_1 \cup \Sigma_2, \delta, F_1 \times F_2)$,
where (i) $\mathcal{L}(s_1, s_2) = \mathcal{L}_1(s_1) \cup \mathcal{L}_2(s_2)$ and (ii) $\delta$ is such that $(s_1, s_2) \xrightarrow{\alpha} (s_1', s_2')$ iff

$$\forall i \in \{1, 2\} \;.\; (\alpha \notin \Sigma_i \wedge s_i = s_i') \vee (\alpha \in \Sigma_i \wedge s_i \xrightarrow{\alpha} s_i')$$


In other words, DLAs must synchronize on shared actions and proceed independently on local actions.
This notion of parallel composition is derived from the Communicating Sequential Processes (CSP)
formalism [Roscoe 98].
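
The composition rule of Definition 5 can be sketched in Python as follows. Representing a DLA as a plain dictionary, along with the key names and the two toy components, is an assumption of this example only.

```python
from itertools import product

def compose(m1, m2):
    """Parallel composition of two DLAs given as dicts with keys
    states, init, label, alphabet, delta, final (illustrative encoding)."""
    sigma = m1["alphabet"] | m2["alphabet"]
    states = set(product(m1["states"], m2["states"]))

    def moves(m, s, a):
        # A component stays put on actions outside its alphabet;
        # otherwise it must take one of its own transitions on that action.
        if a not in m["alphabet"]:
            return {s}
        return {t for (p, b, t) in m["delta"] if p == s and b == a}

    delta = {((s1, s2), a, (t1, t2))
             for (s1, s2) in states for a in sigma
             for t1 in moves(m1, s1, a) for t2 in moves(m2, s2, a)}
    return {
        "states": states,
        "init": set(product(m1["init"], m2["init"])),
        "label": {(s1, s2): m1["label"][s1] | m2["label"][s2]
                  for (s1, s2) in states},
        "alphabet": sigma,
        "delta": delta,
        "final": set(product(m1["final"], m2["final"])),
    }

# Two hypothetical components sharing the action "sync".
m1 = {"states": {0, 1}, "init": {0}, "final": {1},
      "label": {0: frozenset(), 1: frozenset({"p"})},
      "alphabet": {"a", "sync"}, "delta": {(0, "a", 0), (0, "sync", 1)}}
m2 = {"states": {0, 1}, "init": {0}, "final": {1},
      "label": {0: frozenset(), 1: frozenset({"q"})},
      "alphabet": {"b", "sync"}, "delta": {(0, "b", 0), (0, "sync", 1)}}
m = compose(m1, m2)
print(((0, 0), "sync", (1, 1)) in m["delta"])  # shared action: both move
print(((0, 0), "a", (0, 0)) in m["delta"])     # local action: only m1 moves
```

The helper `moves` mirrors the two disjuncts of the rule: a component either ignores an action it does not know or takes one of its own transitions on it.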


Definition 6 (Weakest Assumption) For any DLA $M$ and any safety property expressed as a DLA $\varphi$,
there exists a weakest (w.r.t. the $\sqsubseteq$ preorder) DLA, which we denote as $WA$, with the following
property: for any DLA $E$, $M \parallel E \sqsubseteq \varphi$ iff $E \sqsubseteq WA$ [Giannakopoulou 02]. In fact, it can be shown
that $WA$ is a DLA accepting the language $\overline{L(M \parallel \overline{\varphi})}$.

3.2 Containment


Recall that in the containment step, we verify for each $i \in \mathcal{I}$ that $C_i \sqsubseteq C_i'$ (i.e., every behavior of $C_i$
is also a behavior of $C_i'$). If $C_i \not\sqsubseteq C_i'$, we construct a set $\mathcal{F}_i$ of behaviors in $Behv(C_i) \setminus Behv(C_i')$,
which will be used subsequently for feedback generation. This containment check is performed
iteratively and component-wise as depicted in Figure 3.


Figure  3:  The  Containment  Phase  of  the  Substitutability  Framework 

For each $i \in \mathcal{I}$, the containment check proceeds as follows:

1. Abstraction: Construct finite models $M$ and $M'$ such that (C1) $C_i \sqsubseteq M$ and
(C2) $M' \sqsubseteq C_i'$. Note that $M$ is an overapproximation of $C_i$ and can be
constructed by standard predicate abstraction. However, $M'$ is constructed from
$C_i'$ via a modified predicate abstraction that produces an underapproximation of
its input C program.

Standard  predicate  abstraction  constructs  an  overapproximation  of  the  concrete 
system  via  existential  abstraction.  In  doing  so,  it  checks  the  validity  of  formulas 
using a theorem prover. Intuitively, these formulas express conditions under which
a  transition  is  possible  between  a  pair  of  abstract  states.  Our  modified  predicate 
abstraction  constructs  a  universal  approximation  by  modifying  these  formulas 
appropriately,  so  they  represent  conditions  under  which  a  transition  is  inevitable 
between  a  pair  of  abstract  states. 

2. Verification: Verify that $M \sqsubseteq M'$. If so, then from (C1) and (C2) above, we
know that $C_i \sqsubseteq C_i'$, and we terminate with success. Otherwise we obtain a
counterexample $CE$.

3. Validation 1: Check if $CE$ is a real behavior of $C_i$. If so, we proceed to the next
step. Otherwise we refine model $M$ and repeat the process from Step 2. This
validation and refinement step is done according to the CEGAR procedure
implemented in the MAGIC tool [Chaki 04c].

4. Validation 2: Check if $CE$ is not a real behavior of $C_i'$. If it is not, we know that
$CE \in Behv(C_i) \setminus Behv(C_i')$. We add $CE$ to $\mathcal{F}_i$ and stop. Otherwise we refine
$M'$ and repeat the process from Step 2. This second validation and refinement step
is an antithesis of standard abstraction refinement because it adds the valid
behavior $CE$ back to $M'$. However, it is conceptually similar to standard
abstraction-refinement, and we omit its details in this report.

The above process terminates as soon as it adds a single behavior to $\mathcal{F}_i$. However, it can be modified
easily to generate a set of behaviors in $\mathcal{F}_i$ as follows. Construct a set of counterexamples $\mathcal{CE}$ in Step
2. Then process each element of $\mathcal{CE}$ via Steps 3 and 4 and add to $\mathcal{F}_i$ every counterexample that
belongs to $C_i$ but not to $C_i'$. The next section describes the use of $\mathcal{F}_i$ to provide feedback to
developers, showing how to correct the updated components.
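
The control flow of Steps 2 through 4 can be sketched on a toy model in which behaviors are finite sets of traces. This is a deliberate simplification: the sets `over` and `under` stand in for the abstract models, plain set operations replace predicate abstraction and the theorem prover, and all names are hypothetical.

```python
def containment_check(old_behv, new_behv, universe):
    """Toy version of the containment loop over finite sets of traces.

    old_behv, new_behv: behaviors of the old and new component.
    universe: finite set of candidate traces (illustrative assumption).
    Returns (True, None) on success or (False, ce) with a missing behavior.
    """
    over = set(universe)   # stands in for M, an overapproximation of the old component
    under = set()          # stands in for M', an underapproximation of the new one
    while True:
        gap = over - under                 # Step 2: is M contained in M'?
        if not gap:
            return True, None
        ce = min(gap)                      # deterministic choice of a counterexample
        if ce not in old_behv:             # Step 3: spurious for the old component
            over.discard(ce)               # refine M by removing it
        elif ce not in new_behv:           # Step 4: real old behavior the new one lacks
            return False, ce
        else:
            under.add(ce)                  # refine M' by adding the valid behavior back

universe = {("a",), ("a", "b"), ("a", "c"), ("b",)}
old = {("a",), ("a", "b")}
print(containment_check(old, {("a",), ("a", "c")}, universe))
print(containment_check({("a",)}, {("a",), ("a", "c")}, universe))
```

Each iteration either shrinks `over` or grows `under`, so the loop terminates on a finite universe; the real procedure instead relies on abstraction refinement over program predicates.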


3.3  Compatibility 

Recall  that  the  compatibility  check  is  aimed  at  ensuring  that  the  upgraded  system  satisfies  global 
safety  specifications.  Our  compatibility  check  procedure  involves  two  key  paradigms:  dynamic 
regular-set  learning  and  assume-guarantee  reasoning.  We  first  present  these  two  techniques  and  then 
describe  their  use  in  our  overall  compatibility  algorithm. 

3.3.1  Dynamic  Regular-Set  Learning 

Central  to  our  compatibility  check  procedure  is  a  new  dynamic  algorithm  to  learn  regular  languages. 
Our  algorithm  is  based  on  the  L*  algorithm  developed  by  Angluin  [Angluin  87].  The  compatibility 
check uses a state/event version of L* that is a straightforward extension of the original algorithm
(for simplicity we will refer to both as L*). A detailed description of the state/event L* algorithm,
together with its correctness proof and complexity analysis, can be found in a white paper by
Chaki [Chaki 05b]. We will first present the state/event learning algorithm and then describe the
dynamic version of it that we actually use for checking compatibility. We will denote the symmetric
difference of two sets $X$ and $Y$ by $X \oplus Y$ (i.e., $\rho \in X \oplus Y$ iff $\rho \in X \setminus Y$ or $\rho \in Y \setminus X$).

3.3.1.1 The L* Algorithm

Let $U$ be an unknown regular language over some SE alphabet $\widehat{\Sigma} = (\Sigma, AP)$. In order to learn $U$, L*
interacts with a minimally adequate teacher MAT for $U$, which can provide Boolean answers to the
following two kinds of queries:

1. membership: Given a $\rho \in Trace(\widehat{\Sigma})$, MAT returns TRUE iff $\rho \in U$.

2. candidate: Given a DDLA $D$, MAT returns TRUE iff $L(D) = U$. If MAT returns
FALSE, it also returns a counterexample trace $w \in L(D) \oplus U$.

Given an unknown regular language $U \subseteq Trace(\widehat{\Sigma})$ and a MAT for $U$, the L* algorithm iteratively
constructs a minimal DDLA $D$ such that $L(D) = U$. It maintains an observation table $(S, E, T)$
where (i) $S$ is a prefix-closed set over $Trace(\widehat{\Sigma})$ labeling the rows of the table, (ii) $E$ is a suffix-closed
set over $Word(\widehat{\Sigma})$ labeling the columns of the table, and (iii) $T : (S \cup S \cdot \widehat{\Sigma}) \times E \to \{0, 1\}$ is the
valuation of the table entries such that

$$\forall s \in S \cup S \cdot \widehat{\Sigma} \;.\; \forall e \in E \;.\; T[s, e] = 1 \iff s \cdot e \in U$$

Additionally, for any $s \in S \cup S \cdot \widehat{\Sigma}$, let us define a function $r_s$ as follows:

$$\forall e \in E \;.\; r_s(e) = T[s, e]$$

Given a trace $t \in Trace(\widehat{\Sigma})$, we write $Last(t)$ to mean the last set of propositions in $t$. L* always
ensures that the following invariant holds on the table: for any two distinct $s_1, s_2 \in S$, either $r_{s_1} \neq r_{s_2}$
or $Last(s_1) \neq Last(s_2)$. The table is said to be closed if, for every $t \in S \cdot \widehat{\Sigma}$, there exists an $s \in S$
such that $r_s = r_t$ and $Last(s) = Last(t)$.

Let us denote the empty word by $\lambda$. Then L* starts with a table $(S, E, T)$ such that $S = 2^{AP}$ and
$E = \{\lambda\}$, and each iteration proceeds as follows. It first updates the table using membership queries
until  it  is  closed.  Next  L*  builds  a  candidate  DDLA  D  from  the  table  and  makes  a  candidate  query 
with  D.  If  the  MAT  returns  TRUE  to  the  candidate  query,  L*  returns  D  and  stops.  Otherwise,  L* 
updates  E  with  a  single  word  (constructed  from  the  CE  returned  by  the  candidate  query)  and 
proceeds  with  the  next  iteration.  The  complexity  of  L*  is  expressed  by  the  following 
theorem  [Angluin  87,  Chaki  05b]: 


Theorem 1 If $n$ is the number of states of the minimum DDLA accepting $U$, and $m$ is the upper
bound on the length of any counterexample provided by the MAT, then the total running time of L* is
bounded by a polynomial in $m$ and $n$. Moreover, the observation table is of size $O(m^2 n^2 + m n^3)$.
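
To make the table-based loop concrete, here is a compact Python sketch of L* over ordinary words rather than state/event traces. Two simplifications are assumed and are not the report's method: counterexamples are handled by adding all of their suffixes to $E$ (a well-known variant, whereas the text above adds a single word), and the candidate query is approximated by exhaustive testing up to a fixed length. The names `lstar` and `bounded_equiv` are hypothetical.

```python
from itertools import product

def lstar(alphabet, member, equiv):
    S, E = [()], [()]   # row labels (access words) and column labels (suffixes)

    def row(s):
        return tuple(member(s + e) for e in E)

    while True:
        # Close the table: each one-letter extension must match a row of S.
        changed = True
        while changed:
            changed = False
            rows = {row(s) for s in S}
            for s in list(S):
                for a in alphabet:
                    if row(s + (a,)) not in rows:
                        S.append(s + (a,))
                        rows.add(row(s + (a,)))
                        changed = True
        # Build the candidate automaton: one state per distinct row.
        rep = {row(s): s for s in S}
        trans = {(r, a): row(s + (a,)) for r, s in rep.items() for a in alphabet}
        start = row(())
        accepting = {r for r, s in rep.items() if member(s)}

        def accepts(w, trans=trans, start=start, accepting=accepting):
            q = start
            for a in w:
                q = trans[(q, a)]
            return q in accepting

        ce = equiv(accepts)             # candidate query
        if ce is None:
            return accepts, len(rep)
        for i in range(len(ce)):        # add every suffix of the counterexample
            if ce[i:] not in E:
                E.append(ce[i:])

def bounded_equiv(alphabet, member, max_len):
    """Approximate teacher: compare on all words up to max_len."""
    def equiv(accepts):
        for n in range(max_len + 1):
            for w in product(alphabet, repeat=n):
                if accepts(w) != member(w):
                    return w
        return None
    return equiv

# Hypothetical target language: words ending in "ab".
ends_ab = lambda w: w[-2:] == ("a", "b")
dfa, n_states = lstar(("a", "b"), ends_ab, bounded_equiv(("a", "b"), ends_ab, 6))
print(n_states, dfa(("b", "a", "b")), dfa(("a", "b", "a")))
```

Rows of `S` stay pairwise distinct because a word is added to `S` only when its row is new, and adding columns never merges rows; this is what makes the candidate automaton well defined.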

3.3.1.2 Dynamic L*

Normally L* initializes with $S = 2^{AP}$ and $E = \{\lambda\}$. This can be a drawback in cases where a
previously learned candidate (and hence a table) exists and we wish to restart learning using
information from the previous table. In the following discussion, we show that if L* begins with any
non-empty valid table, it must terminate with the correct result (Theorem 2). In particular, this
theorem allows us to perform our compatibility check dynamically: instead of starting from an empty
table, we restart L* from any previously computed table after revalidating it.¹


Definition 7 (Agreement) An observation table $(S, E, T)$ is said to agree with a regular language $U$
iff: $\forall (s, e) \in (S \cup S \cdot \widehat{\Sigma}) \times E$, $T[s, e] = 1$ iff $s \cdot e \in U$. Also, $(S, E, T)$ agrees with a candidate
DDLA $D$ if it agrees with $L(D)$.


Definition 8 (Validity) An observation table $\mathcal{T} = (S, E, T)$ is said to be valid for a language $U$ iff
$\mathcal{T}$ agrees with $U$. We say that a candidate derived from a closed table $\mathcal{T}$ is valid if $\mathcal{T}$ is valid.


Theorem  2  L*  terminates  with  a  correct  result  for  any  unknown  language  U  starting  from  any  valid 
table  for  U. 


¹ A similar idea was also proposed in the context of adaptive model checking [Groce 02].

Proof. Let $n$ be the number of states in the minimal DDLA $M_U$ such that $L(M_U) = U$. Note that both
Theorem 1 and Lemma 5 from Angluin's correctness proof for L* [Angluin 87] hold true for valid and
closed tables and candidates consistent with them. It follows from Theorem 1 and Lemma 5 that L*
can always make a valid table closed and hence is able to construct a candidate, say $D$, with at most $n$
states. We now show that every subsequent candidate must have at least one more state than $D$.

A candidate query with $D$ either returns TRUE or a counterexample $CE \in L(D) \oplus U$. Note that the
table must agree with $D$ since $D$ is consistent with it. Also since the table is valid, it must agree with
$U$. Therefore, $CE \notin (S \cup S \cdot \widehat{\Sigma}) \cdot E$ and will be added to $S$. Again, a valid and closed table
$\mathcal{T}' = (S', E', T')$ must be obtained eventually after adding $CE$. Let $D'$ be the corresponding candidate.

Now, $D'$ is consistent with $\mathcal{T}$ since $\mathcal{T}'$ extends $\mathcal{T}$. Also $D'$ agrees with $M_U$ as far as accepting $CE$ is
concerned, while $D$ does not. Hence $D'$ is inequivalent to $D$, and, according to Theorem 1 in
Angluin's proof, it must have at least one more state than $D$. Hence, starting from $D$, L* can make at
most $n - 1$ incorrect candidates, since the number of states is initially at least one, always increases
monotonically, and may not exceed $n - 1$. Since L* must continue making new candidates as long as
it is running, it must terminate with a correct candidate $M_U$.

Suppose we have a table $\mathcal{T}$ that is valid for an unknown language $U$, and we have a new unknown
language $U'$ different from $U$. Suppose we want to learn $U'$ by starting L* with table $\mathcal{T}$. Note that in
general $\mathcal{T}$ will not be valid for $U'$; hence, starting from $\mathcal{T}$ will not be appropriate. However, we can
first validate $\mathcal{T}$ against $U'$ and then start L* from the validated $\mathcal{T}$. Theorem 2 provides the key insight
behind the correctness of this procedure. As we shall see, this idea forms the backbone of our dynamic
compatibility-check procedure (see Section 3.3.3).
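
Validating an existing table against a new language amounts to re-asking membership queries for every stored entry. The sketch below shows this for ordinary words (a simplification of the state/event setting); the function name `revalidate` and the two example languages are assumptions made for illustration.

```python
def revalidate(S, E, alphabet, member_new):
    """Recompute all entries of an observation table against a new language.

    S, E: row and column labels kept from the previous learning run.
    member_new: membership oracle for the new unknown language U'.
    Returns the new valuation T' as a dict keyed by (row, column).
    """
    rows = list(S) + [s + (a,) for s in S for a in alphabet]
    return {(r, e): int(member_new(r + e)) for r in rows for e in E}

alphabet = ("a", "b")
S, E = [(), ("a",)], [()]
member_old = lambda w: w.count("a") % 2 == 0   # hypothetical old language U
member_new = lambda w: "b" not in w            # hypothetical new language U'

T_old = revalidate(S, E, alphabet, member_old)
T_new = revalidate(S, E, alphabet, member_new)
changed = sorted({r for (r, e) in T_old if T_old[(r, e)] != T_new[(r, e)]})
print(changed)   # rows whose entries had to be corrected
```

Only the valuation changes; the row and column labels are reused, which is where the savings over restarting from an empty table come from. The revalidated table may no longer be closed, so it still has to be closed before a candidate is built.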

3.3.2  Assume-Guarantee  Reasoning 

Along with dynamic L*, we also use assume-guarantee style compositional reasoning to check
compatibility. Given a set of component DLAs $M_1, \ldots, M_n$ and a specification DLA $\varphi$, the following
non-circular rule AG [Pnueli 85] can be used to verify $M_1 \parallel \cdots \parallel M_n \sqsubseteq \varphi$:

$$\frac{M_1 \parallel A_1 \sqsubseteq \varphi \qquad M_2 \parallel \cdots \parallel M_n \sqsubseteq A_1}{M_1 \parallel \cdots \parallel M_n \sqsubseteq \varphi}$$


In the above rule, $A_1$ is a DLA representing the assumption about the environment under which
$M_1$ is expected to operate correctly. As also observed by Cobleigh and colleagues [Cobleigh 03], the
second premise is itself an instance of the top-level proof obligation with $n - 1$ component DLAs.
Hence, AG can be applied to decompose it further.
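
The shape of the AG rule can be illustrated on finite languages. The sketch assumes that all components share one alphabet, so that parallel composition reduces to language intersection; this is a simplification for the example, not the general DLA composition, and every name in it is hypothetical.

```python
def ag_premises_hold(m1, rest, assumption, phi):
    """Check both premises of the AG rule on languages given as sets of words.

    Assumes a common alphabet, so composition is modeled as intersection.
    """
    premise1 = (m1 & assumption) <= phi   # M1 || A1 is contained in phi
    premise2 = rest <= assumption         # M2 || ... || Mn is contained in A1
    return premise1 and premise2

m1 = {"ab", "ac", "b"}
m2 = {"ab", "b", "d"}
phi = {"ab", "b", "x"}

good_assumption = {"ab", "b", "d"}
too_strong = {"ab"}

print(ag_premises_hold(m1, m2, good_assumption, phi))  # premises discharge
print(ag_premises_hold(m1, m2, too_strong, phi))       # premise 2 fails
print((m1 & m2) <= phi)                                # conclusion holds
```

The second call shows why the choice of assumption matters: the conclusion holds for this system, yet an assumption that is too strong fails the second premise, so the rule is inconclusive until a better assumption is found.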

3.3.3  Compatibility  Check  for  C  Components 

The procedure for checking compatibility of new components in the context of the original component
assembly is presented in Figure 4. Given an old component assembly $\mathcal{C} = \{C_1, \ldots, C_n\}$ and a set of
new components $\mathcal{C}' = \{C_i' \mid i \in \mathcal{I}\}$ (where $\mathcal{I} \subseteq \{1, \ldots, n\}$), the compatibility-check procedure
checks if a safety property $\varphi$ holds in the new assembly. We first present an overview of the
compatibility procedure and then discuss its implementation in detail. The procedure uses a
DynamicCheck algorithm and is done in an iterative abstraction-refinement style as follows:




1. Use predicate abstraction to obtain finite DLA models $M_i$, where $M_i$ is
constructed from $C_i$ if $i \notin \mathcal{I}$ and from $C_i'$ if $i \in \mathcal{I}$. The abstraction is carried out
component-wise. Let $\mathcal{M} = \{M_1, \ldots, M_n\}$.

2. Apply DynamicCheck on $\mathcal{M}$. If the result is TRUE, the compatibility check
terminates successfully. Otherwise, we obtain a counterexample $CE$.

3.  Check  if  CE  is  a  valid  counterexample.  Once  again  this  is  done  component-wise. 
If  CE  is  valid,  the  compatibility  check  terminates  unsuccessfully  with  CE  as  a 
counterexample.  Otherwise  we  go  to  the  next  step. 

4. Refine a specific model, say $M_k$, such that the spurious $CE$ is eliminated. Repeat
the process from Step 2.


Figure  4:  The  Compatibility  Phase  of  the  Substitutability  Framework 


3.3.3.1  Overview  of  DynamicCheck 

We first present an overview of the algorithm for two DLAs and then generalize it to an arbitrary
collection of DLAs. Suppose we have two old DLAs, $M_1$ and $M_2$, and a property DLA $\varphi$. We assume
that we previously tried to verify $M_1 \parallel M_2 \sqsubseteq \varphi$ using DynamicCheck. The algorithm
DynamicCheck uses dynamic L* to learn appropriate assumptions that can discharge the premises of
AG. In particular, suppose that while trying to verify $M_1 \parallel M_2 \sqsubseteq \varphi$, DynamicCheck had constructed
an observation table $\mathcal{T}$.

Now suppose that we have new versions $M_1'$ and $M_2'$ for $M_1$ and $M_2$. Note that, in general, either $M_1'$
or $M_2'$ could be identical to its old version. DynamicCheck will now reuse $\mathcal{T}$ and invoke the dynamic
L* algorithm to automatically learn an assumption $A'$ such that (i) $M_1' \parallel A' \sqsubseteq \varphi$ and (ii) $M_2' \sqsubseteq A'$.
More precisely, DynamicCheck proceeds iteratively as follows:

1. It checks if $M_1 = M_1'$. If so, it starts learning from the previous table $\mathcal{T}$ (i.e., it
sets $\mathcal{T}' := \mathcal{T}$). Otherwise, it revalidates $\mathcal{T}$ against $M_1'$ to obtain a new table $\mathcal{T}'$.

2. It derives a conjecture $A'$ from $\mathcal{T}'$ and checks if $M_2' \sqsubseteq A'$. If this check passes, it
terminates with TRUE and the new assumption $A'$. Otherwise, it obtains a
counterexample $CE$.

3. It analyzes $CE$ to see if $CE$ corresponds to a real counterexample to
$M_1' \parallel M_2' \sqsubseteq \varphi$. If so, it constructs such a counterexample and terminates with
FALSE. Otherwise, it updates $\mathcal{T}'$ using $CE$.

4. It makes $\mathcal{T}'$ closed by making membership queries and repeats the process from
Step 2.

3.3.3.2 Generalized DynamicCheck

We  first  describe  the  key  ideas  that  enable  us  to  reuse  the  previous  assumptions  and  then  present  the 
complete  DynamicCheck  algorithm  for  multiple  DLAs.  Due  to  its  dynamic  nature,  the  algorithm  will 
be  able  to  locally  identify  the  set  of  assumptions  that  must  be  modified  to  revalidate  the  system. 

Incremental Changes Between Successive Assumptions. Recall that the L* algorithm
maintains an observation table $(S, E, T)$ corresponding to an assumption $A$ for every component $M$.
During an initial compatibility check, this table stores the information about membership of the
current set of traces in an unknown language $U$ (i.e., the language of the weakest assumption for $M$).
Upgrading the component $M$ modifies this unknown language for the corresponding assumption from
$U$ to, say, $U'$. Therefore, checking compatibility after an upgrade requires the learner to
compute a new assumption $A'$ corresponding to $U'$. In most cases, the languages $L(A)$ and $L(A')$
differ only slightly; hence, the information about the behaviors of $A$ is reused in computing $A'$.

Table Revalidation. The original L* algorithm computes $A'$ starting from an empty table. However,
as mentioned before, a more efficient algorithm would reuse the previously inferred
elements of $S$ and $E$ to learn $A'$. The result in Section 3.3.1.2 (Theorem 2) is precisely what enables the L*
algorithm to achieve this goal. In particular, since L* terminates starting from any valid table, the
assumption learner first obtains a valid table by reusing words in $S$ and $E$: it updates $T$ by asking
membership queries with regard to $U'$ for each $\rho \in (S \cup S \cdot \widehat{\Sigma}) \cdot E$. The valid table $(S, E, T')$
thereby obtained is subsequently made closed, and then learning proceeds in the normal fashion.
Doing this allows the compatibility check to restart from any previous set of assumptions by
revalidating them. The GenerateAssumption module implements this feature (see Figure 5).

Overall DynamicCheck Procedure. The DynamicCheck procedure instantiates the AG rule for
$n$ components and enables checking multiple upgrades simultaneously by reusing previous
assumptions and verification results. In the description, we denote the previous and new versions of a
component DLA by $M$ and $M'$ and the previous and new versions of component assemblies by $\mathcal{M}$
and $\mathcal{M}'$, respectively. For ease of description, we always use a property, $\varphi$, to denote the right-hand
side of the top-level proof obligation of the compositional rule. We denote the modified property² at
each recursion level of the algorithm by $\varphi'$. The old and new assumptions are denoted by $A$ and $A'$,
respectively.

² Under the recursive application of the compatibility-check procedure, the updated property $\varphi'$ corresponds to an
assumption from the previous recursion level.

Figure 5 presents the pseudo-code of the DynamicCheck algorithm to perform the compatibility
check. Lines 1-4 describe the case when $\mathcal{M}'$ contains only one component. In Line 5, an assumption
$A'$ corresponding to $M'$ and $\varphi'$ is generated using dynamic L* such that $M' \parallel A' \sqsubseteq \varphi'$. Lines 6-8
describe recursive invocation of DynamicCheck on $\mathcal{M}' \setminus M'$ against property $A'$. Finally, Lines 9-15
show how the algorithm detects a counterexample $CE$ and updates $A'$ with it or terminates with a
TRUE/FALSE result. The salient features of this algorithm are the following:

•  GenerateAssumption  (Line  5)  does  not  generate  new  assumptions  every  time 
DynamicCheck  is  invoked.  Instead,  it  reuses  (by  revalidating  if  necessary)  the 
assumption  A  computed  in  the  previous  compatibility  check.  When  CE  is  used  to 
update  A,  GenerateAssumption  (Line  12)  does  not  need  to  revalidate  A  because 
it  had  to  be  validated  previously. 

•  Verification checks are repeated on a component $M'$ (or a collection of
components $\mathcal{M}' \setminus M'$) only if it is (or they are) found to be different from the
previous version $M$ ($\mathcal{M} \setminus M$) or if the corresponding property $\varphi$ has changed
(Lines 3, 7, 12). Otherwise, the previously computed result is reused (Lines 4, 8).

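DynamicCheck($\mathcal{M}'$, $\varphi'$) returns counterexample or true
 1:  let $M'$ = first element of $\mathcal{M}'$;
 2:  if ($\mathcal{M}' = \{M'\}$)
 3:      if ($M \neq M'$ or $\varphi \neq \varphi'$) return ($M' \sqsubseteq \varphi'$);
 4:      else return ($M \sqsubseteq \varphi$);
 5:  $A'$ := GenerateAssumption($M'$, $\varphi'$);
 6:  if ($A \neq A'$ or $\mathcal{M} \setminus M \neq \mathcal{M}' \setminus M'$)
 7:      $CE$ := DynamicCheck($\mathcal{M}' \setminus M'$, $A'$);
 8:  else $CE$ := DynamicCheck($\mathcal{M} \setminus M$, $A$);
 9:  while ($CE$ is non-empty)
10:      if ($M' \parallel CE \sqsubseteq \varphi'$)
11:          $A'$ := UpdateAssumption($A'$, $CE$);
12:          $A'$ := GenerateAssumption($M'$, $\varphi'$);
13:          $CE$ := DynamicCheck($\mathcal{M}' \setminus M'$, $A'$);
14:      else return a witness counterexample $CE'$ to $M' \parallel CE \not\sqsubseteq \varphi'$;
15:  return true;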

Figure  5:  Pseudo-Code  for  Efficient  Compatibility  Checking 


The  correctness  of  DynamicCheck  follows  from  the  following  theorem. 


Theorem 3 Given modified $\mathcal{M}'$ and $\varphi'$, the DynamicCheck algorithm always terminates with either
TRUE or a counterexample $CE$ to $\mathcal{M}' \sqsubseteq \varphi'$.


Proof. The notion of weakest assumptions is used in proving the correctness of DynamicCheck. For
any DLA $M$, there must exist a weakest environment assumption DLA $WA$ such that $M \parallel E \sqsubseteq \varphi$ iff
$E \sqsubseteq WA$. Suppose we have a system of components $M_1, \ldots, M_n$ and a global property $\varphi$. Consider
rules of the form $M_i \parallel A_i \sqsubseteq A_{i-1}$ ($1 \le i \le n - 1$, $A_0 = \varphi$) and $M_n \sqsubseteq A_{n-1}$ as used in the recursive
procedure DynamicCheck to show that $M_1 \parallel \cdots \parallel M_n \sqsubseteq \varphi$. It is clear that a weakest assumption
$WA_1$ exists such that $M_1 \parallel WA_1 \sqsubseteq \varphi$. Given $WA_1$, it follows that $WA_2$ must exist so that
$M_2 \parallel WA_2 \sqsubseteq WA_1$. Therefore, by induction on $i$, there must exist weakest assumptions $WA_i$, for
$1 \le i \le n - 1$, such that $M_i \parallel WA_i \sqsubseteq WA_{i-1}$ ($1 \le i \le n - 1$, $WA_0 = \varphi$) and $M_n \sqsubseteq A_{n-1}$. Also, by
Theorem 2, UpdateAssumption($A'$, $CE$) must terminate starting from any valid assumption $A'$ with
respect to $U'$ and a counterexample $CE \in L(A') \oplus U'$.

Suppose, without loss of generality, that component DLA $M'$ is upgraded. Note that after an upgrade,
a weakest assumption $WA'$ (possibly different from $WA$) must exist for every $M' \in \mathcal{M}'$. We proceed
by induction over the size $k$ of $\mathcal{M}'$. In the base case, it is clear that we need to model check $M'$
against $\varphi'$ only if either $M$ or $\varphi$ changed (Line 3). By performing this model checking, either a
counterexample to $M' \sqsubseteq \varphi'$ is returned or the previous $M \sqsubseteq \varphi$ (Line 4) result holds.

Assume for the inductive case that DynamicCheck($\mathcal{M}' \setminus M'$, $A'$) terminates with either TRUE or a
counterexample $CE$. It is clear from its definition that $A'$ computed by GenerateAssumption (Line
5) is valid. If Line 6 holds (i.e., $A' \neq A$ or $\mathcal{M} \setminus M \neq \mathcal{M}' \setminus M'$), then, by inductive hypothesis,
execution of Line 7 terminates with either a TRUE result or a counterexample $CE$. Otherwise, the
previously computed $CE$ result is used (Line 8). It remains to be shown that Lines 9-15 compute the
correct return value based on this result.

If this result is TRUE, it follows from the soundness of the assume-guarantee rule that $\mathcal{M}' \sqsubseteq \varphi'$ and
DynamicCheck returns TRUE (Line 15). If $M' \parallel CE \not\sqsubseteq \varphi'$ (Line 10), then, by set-theoretic arguments
based on the definitions of $A'$ and $CE$, we know that $\mathcal{M}' \not\sqsubseteq \varphi'$ and a suitable witness $CE'$ (Line 14)
is returned by the algorithm. Otherwise, since $A'$ is valid, both UpdateAssumption (Line 11) and
GenerateAssumption (Line 12) must terminate by learning a new assumption, say $A''$, such that
$M' \parallel A'' \sqsubseteq \varphi'$. It follows from the proof of correctness of L* that $|A'| < |A''|$ and from the definition
of weakest assumptions that $|A''| \le |WA'|$. Also, by inductive hypothesis, Line 13 must terminate
with the correct $CE$ result. Hence, Lines 9-13 of the while loop may be executed only a finite number
of times until $|A''| = |WA'|$, when (by set-theoretic arguments) either the result is TRUE (Line 15) or a
witness counterexample $CE'$ (Line 14) for $\mathcal{M}' \not\sqsubseteq \varphi'$ is returned.

Further  Optimizations.  Recall  that  our  procedure  reuses  assumptions  generated  during  previous 
compatibility  checks.  We  further  optimize  it  by  identifying  a  subset  of  assumptions  that  must  be 
revalidated  at  the  initialization  of  the  next  check.  This  optimization  is  enabled  by  the  following  lemma 
whose  proof  follows  directly  from  Theorem  3  and  the  definition  of  weakest  assumptions. 


Lemma 1 Let $\mathcal{M} = \{M_1, \ldots, M_n\}$ be an assembly of components; let $\mathcal{A} = \{A_1, \ldots, A_{n-1}\}$ be a set
of previously computed assumptions; and let $\mathcal{I} \subseteq \{1, \ldots, n\}$ be an index set. Also, let $\{M_i' \mid i \in \mathcal{I}\}$
be the set of new components. If $k$ is the minimum index of $\mathcal{I}$, then it is sufficient for DynamicCheck
to revalidate only the assumptions in the set $\{A_j \mid j \ge k \wedge j < n\}$.
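
As a small worked example of Lemma 1, the following sketch computes which assumption indices need revalidation for a given set of upgraded components. The function name and the sample index sets are hypothetical.

```python
def assumptions_to_revalidate(n, upgraded):
    """Indices j of assumptions A_j to revalidate, with k = min(upgraded):
    all j such that k <= j < n. Indices are 1-based, as in the lemma."""
    if not upgraded:
        return []
    k = min(upgraded)
    return [j for j in range(1, n) if j >= k]

# Seven components, with components 3 and 5 upgraded.
print(assumptions_to_revalidate(7, {3, 5}))   # [3, 4, 5, 6]
# Upgrading only the last component leaves every assumption untouched.
print(assumptions_to_revalidate(7, {7}))      # []
```

Assumptions with an index below $k$ belong to unchanged components checked against unchanged properties, so their earlier results can be reused as they are.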


3.4  Feedback 

Recall that for some $i \in \mathcal{I}$, if our containment check detects that $C_i \not\sqsubseteq C_i'$, it also computes a set $\mathcal{F}_i$.
Intuitively each element of $\mathcal{F}_i$ represents a behavior of $C_i$ that is not a behavior of $C_i'$. We now present
our process of generating feedback from $\mathcal{F}_i$. In the rest of this section, we will write $C$, $C'$, and $\mathcal{F}$ to
mean $C_i$, $C_i'$, and $\mathcal{F}_i$, respectively.

Consider any behavior $\pi$ in $\mathcal{F}$. Recall that $\pi$ is a trace of a DLA $M$ obtained by predicate abstraction
of $C$. By simulating $\pi$ on $M$, we construct an alternating sequence $Rep(\pi) = \langle s_1, \alpha_1, \ldots, s_n \rangle$ of
states and actions of $M$ corresponding to $\pi$. Recall from our earlier discussion of predicate abstraction
(see Section 3.2) that each $s_i$ is of the form $(st_i, V_i)$, where $st_i$ is a statement of $C$ and $V_i$ is a
predicate valuation. Thus, $Rep(\pi) = \langle (st_1, V_1), \alpha_1, \ldots, (st_n, V_n) \rangle$.

We also know that $\pi$ represents an actual behavior of $C$ but not an actual behavior of $C'$. Thus, there
is a prefix $Pref(\pi)$ of $\pi$ such that $Pref(\pi)$ represents a behavior of $C'$. However, any extension of
$Pref(\pi)$ is no longer a valid behavior of $C'$. Note that $Pref(\pi)$ can be constructed by simulating $\pi$ on
$C'$. Let us denote the suffix of $\pi$ after $Pref(\pi)$ by $Suff(\pi)$. Since $Pref(\pi)$ is an actual behavior of
$C'$, we can also construct a representation for $Pref(\pi)$ in terms of the statements and predicate
valuations of $C'$. Let us denote this representation by $Rep'(Pref(\pi))$.

As our feedback, we produce as output, for each $\pi \in \mathcal{F}$, the following representations:
$Rep(Pref(\pi))$, $Rep(Suff(\pi))$, and $Rep'(Pref(\pi))$. Such feedback allows us to identify the exact
divergence point of $\pi$ beyond which it ceases to correspond to any concrete behavior of $C'$. Since the
feedback refers to a program statement, it allows us to understand at the source code level why $C$ is
able to match $\pi$ completely, but $C'$ is forced to diverge from $\pi$ beyond $Pref(\pi)$. This understanding
makes it easier to modify $C'$ so that the missing behavior $\pi$ can be added back to it.
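
The split of a missing behavior into $Pref(\pi)$ and $Suff(\pi)$ can be sketched as follows. The example assumes traces are tuples of actions and that the new component's behaviors are available through a prefix-closed membership test; both are simplifications for illustration, and the names are hypothetical.

```python
def split_at_divergence(trace, is_behavior_of_new):
    """Return (pref, suff), where pref is the longest prefix of trace that
    the new component can still match and suff is the remainder."""
    cut = 0
    for i in range(1, len(trace) + 1):
        if is_behavior_of_new(trace[:i]):
            cut = i
        else:
            break          # first point at which the new component diverges
    return trace[:cut], trace[cut:]

# Hypothetical prefix-closed behavior set of an upgraded component.
new_behaviors = {(), ("lock",), ("lock", "write")}
pi = ("lock", "write", "unlock")   # behavior of the old component only

pref, suff = split_at_divergence(pi, lambda t: t in new_behaviors)
print(pref)   # ('lock', 'write')
print(suff)   # ('unlock',)
```

The first element of `suff` marks the divergence point; in the report's setting it would be mapped back to a program statement and predicate valuation to show the developer where the upgrade drops the behavior.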


3.5  Implementation  and  Experimental  Evaluation 

The  procedures  for  checking,  in  a  dynamic  manner,  the  substitutability  of  components,  were 
implemented  in  the  COPPER  model  checker  [Chaki  05c].  The  tool  includes  a  front  end  for  parsing  and 
constructing  control-flow  graphs  from  C  programs.  Further,  it  is  capable  of  model  checking  properties 
on  programs  based  on  automated  may-abstraction  (existential  abstraction),  and  it  allows 
compositional  verification  by  employing  learning-based,  automated  assume-guarantee  reasoning.  We 
reused  the  above  features  of  COPPER  in  the  implementation  of  the  substitutability  check.  The  tool 
interface  was  modified  so  a  collection  of  components  and  corresponding  upgrades  could  be  specified. 
We extended the learning-based, automated assume-guarantee reasoning to obtain its dynamic version, as
required in the compatibility check. Doing this involved keeping multiple learner instances across
calls to the verification engine and implementing algorithms to efficiently validate multiple previous
observation tables during learning. We also implemented the underapproximation generation
algorithms for performing the containment check on small program examples. Doing this involved
procedures for implementing must-abstractions from C code using predicates obtained from C
components. The automated refinement procedures are still under implementation and will enable
containment checks on larger benchmarks.

We validated the component substitutability framework by verifying upgrades of a benchmark
provided to us by our industrial partner, ABB Inc. [ABB 05]. The benchmark consists of seven
components that together implement an interprocess communication (IPC) protocol. The combined
state space is over $10^6$.

We  used  a  set  of  properties  describing  the  functionality  of  the  verified  portion  of  the  IPC  protocol.  We 
used upgrades of the write-queue (ipc1) and the ipc-queue (ipc2 and ipc3) components. The upgrades
had both missing and extra behaviors compared to their original versions. We verified two properties
(P1 and P2) before and after the upgrades. We also verified the properties on a simultaneous upgrade
(ipc4) of both the components. P1 specifies that a process may write data into the ipc-queue only after
it  obtains  a  lock  for  the  corresponding  critical  section.  P2  specifies  an  order  in  which  data  may  be 
written  into  the  ipc-queue.  Figure  6  shows  the  comparison  between  the  time  required  for  initial 
verification  of  the  IPC  system,  and  the  time  taken  by  DynamicCheck  for  verifying  the  upgrades.  In 
Figure 6, # Mem. Queries denotes the total number of membership queries made during verification
of  the  original  assembly,  Torig  denotes  the  time  required  for  the  verification  of  the  original  assembly, 
and  Tug  denotes  the  time  required  for  the  verification  of  the  upgraded  assembly. 


Upgrade # (Prop.)    # Mem. Queries    Torig (msec)    Tug (msec)
ipc1(P1)             279               2260            13
ipc1(P2)             308               1694            14
ipc2(P1)             358               3286            17
ipc2(P2)             232               805             10
ipc3(P1)             363               3624            17
ipc3(P2)             258               1649            14
ipc4(P1)             355               1102            24

Figure  6:  Summary  of  Results  for  DynamicCheck 


We  observed  that  the  previously  generated  assumptions  in  all  the  cases  were  also  sufficient  to  prove 
the properties on the upgraded system. Hence, the compatibility check succeeded in a small fraction
(Tug) of the time required for compositional verification (Torig) of the original system.
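
To make "a small fraction" concrete, the ratio Torig / Tug can be computed for each row of Figure 6. The short script below does only that arithmetic. The row labels are our reading of the figure, and the names `FIGURE_6` and `speedups` are introduced here for illustration.

```python
# Rows of Figure 6 as read from the report: (upgrade, property, Torig msec, Tug msec).
FIGURE_6 = [
    ("ipc1", "P1", 2260, 13),
    ("ipc1", "P2", 1694, 14),
    ("ipc2", "P1", 3286, 17),
    ("ipc2", "P2", 805, 10),
    ("ipc3", "P1", 3624, 17),
    ("ipc3", "P2", 1649, 14),
    ("ipc4", "P1", 1102, 24),
]


def speedups(rows):
    """Return {(upgrade, property): Torig / Tug} for each row."""
    return {(u, p): t_orig / t_ug for u, p, t_orig, t_ug in rows}


if __name__ == "__main__":
    for (upgrade, prop), ratio in speedups(FIGURE_6).items():
        print(f"{upgrade}({prop}): {ratio:.1f}x")
```

On these figures the ratio ranges from roughly 46 (ipc4, P1) to roughly 213 (ipc3, P1). Note that Torig and Tug measure different tasks (initial verification versus checking an upgrade), so the ratio describes the relative cost of the two tasks and is not a like-for-like benchmark.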

4  Related  Work


Related  projects  often  impose  the  restriction  that  every  behavior  of  a  new  component  must  also  be  a 
behavior  of  the  old  component.  In  such  a  case,  the  new  component  is  said  to  refine  the  old  component. 
For  instance,  de  Alfaro  and  colleagues  [de  Alfaro  01,  Chakrabarti  02]  define  a  notion  of  interface 
automaton  for  modeling  component  interfaces  and  show  compatibility  between  components  via 
refinement  and  consistency  between  interfaces.  However,  automated  techniques  for  constructing 
interface  automata  from  component  implementations  are  not  presented.  In  contrast,  our  approach 
automatically  extracts  conservative  DLA  models  (which  are  similar  to  finite-state  interface  automata) 
from  component  implementations.  Moreover,  we  do  not  require  refinement  among  the  old 
components  and  their  new  versions. 

McCamant  and  Ernst  [McCamant  04]  suggest  a  technique  for  checking  compatibility  of 
multi-component  upgrades.  They  derive  consistency  criteria  by  focusing  on  input/output  component 
behavior  only  and  abstract  away  the  temporal  information.  Even  though  they  state  that  their 
abstractions  are  unsound  in  general,  they  report  success  in  detecting  important  errors.  In  contrast,  our 
abstractions  preserve  temporal  information  about  component  behavior  and  are  always  sound.  They 
also  use  a  refinement-based  notion  on  the  generated  consistency  criteria  for  showing  compatibility. 

Learning is useful in practice because it is amenable to complete automation, and it is rapidly
gaining popularity in formal verification [Groce 02]. The use of
learning  for  automated  assume-guarantee  reasoning  was  proposed  originally  by  Cobleigh  and 
colleagues  [Cobleigh  03].  The  use  of  learning  along  with  predicate  abstraction  has  also  been  applied 
in  the  context  of  interface  synthesis  [Alur  05]  and  various  types  of  assume-guarantee  proof  rules  for 
automated  software  verification  [Chaki  04a]. 

This  work  is  related  to  our  earlier  project  [Chaki  04b]  that  solves  the  component-substitutability 
problem  in  the  context  of  verifying  individual  component  upgrades.  A  major  improvement  of  the 
current  work  is  that  it  is  aimed  at  verifying  the  component  substitutability  in  the  presence  of 
simultaneous  upgrades  of  multiple  components.  Another  distinction  of  this  work  is  that  it  provides  an 
innovative  dynamic  assume-guarantee  reasoning  framework  for  the  compatibility  check.  The  dynamic 
nature  of  the  compatibility  check  allows  reusing  previously  computed  assumptions  to  prove  or 
disprove  the  global  properties  of  the  updated  system. 
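
For readers unfamiliar with assume-guarantee reasoning, the non-circular proof rule commonly used in learning-based approaches such as [Cobleigh 03] can be written as below. The symbols M_1, M_2, A, and \varphi are notation assumed for this sketch and may differ from the notation used elsewhere in this report.

```latex
\[
\frac{\langle A \rangle \; M_1 \; \langle \varphi \rangle
      \qquad
      \langle \mathit{true} \rangle \; M_2 \; \langle A \rangle}
     {\langle \mathit{true} \rangle \; M_1 \parallel M_2 \; \langle \varphi \rangle}
\]
```

Read this way, reuse has a simple intuition: if only M_2 is replaced by an upgrade M_2' and the old assumption A is kept, the first premise is unchanged and only the second premise needs to be rechecked for M_2'. When that check fails, the assumption has to be revised, which is where altering previously computed assumptions comes in.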

Additionally,  this  report  gives  a  new  solution  to  the  containment-check  problem  presented  by  Chaki 
and  colleagues  [Chaki  04b].  In  our  earlier  work,  the  containment  step  is  solved  using  learning 
techniques  for  regular  sets  and  handles  finite-state  systems  only.  In  contrast,  the  new  approach  is 
extended  to  handle  infinite-state  C  programs.  Moreover,  this  report  defines  a  new  technique  based  on 
the  simultaneous  use  of  overapproximations  and  underapproximations  obtained  via  existential  and 
universal  abstractions. 

5  Conclusion


This  report  presents  results  of  the  SEI IRAD  project  on  verification  of  evolving  software  via 
component-substitutability analysis. It addresses the critical problem of
component-substitutability analysis and provides a solution that consists of two phases: (1)
containment and (2) compatibility checks. The compatibility check performs compositional reasoning
with the help of a dynamic regular-language inference algorithm and a model checker. Our experiments
confirm  that  the  dynamic  approach  is  more  effective  than  complete  revalidation  of  the  system  after  an 
upgrade.  The  containment  check  detects  behaviors  that  were  present  in  each  component  before,  but 
not  after,  the  upgrade.  These  behaviors  are  used  to  construct  useful  feedback  to  the  developers.  We 
observed  that  the  order  of  components  used  to  discharge  the  assume-guarantee  rules  has  a  significant 
impact  on  the  algorithm  runtimes  and,  hence,  needs  investigation.  We  would  further  like  to  investigate 
a  modification  of  DynamicCheck  based  on  a  more  efficient  L*  algorithm  by  Rivest  and 
colleagues  [Rivest  93]  to  improve  its  performance. 

The  component-substitutability  analysis  has  been  implemented  in  the  COPPER  tool  [Chaki  05c]  that 
can be invoked within the ComFoRT framework. The verification framework was validated on a
benchmark provided by our industrial partner, ABB [ABB 05], with encouraging results.